Effects  of  Increasing  Information  Processing  Demands  on  Rating  Outcomes


Abstract 

This  research  investigated  the  cognitive  processes  which  mediate 
the  performance  rating  process.  Specifically,  level  of 
processing  and  ratee  prior  performance  information  were
manipulated  in  a  3  X  3  factorial  design  in  order  to  assess  the 
impact  on  psychometric  rating  outcomes  and  rating  accuracy. 
Results  indicated  that  as  information  processing  demands 
increased,  raters  relied  more  on  the  past  performance  cues. 
Specifically,  raters  using  more  automatic  processing  and 
receiving  a  good  performance  cue  gave  more  lenient  ratings,  and 
those  using  automatic  processing  and  receiving  a  poor  performance 
cue  exhibited  increased  halo.  In  addition,  raters  were  least
accurate  in  recognizing  behaviors  consistent  with  their
performance  cue.  Implications  for  future  research  in  performance

Performance  appraisal  research  is  in  a  state  of  transition.
For  the  last  two  decades,  researchers  have  focused  on  basically 
three  strategies  for  increasing  rating  validity:  a)  redesigning 
rating  formats,  b)  training  raters  to  minimize  errors,  and  c) 
increasing  observation  skills.  Empirical  studies  of  the  success
of  these  approaches  have  shown  some  decrease  in  rating  errors
such  as  halo,  but  no  corresponding  increase  in  accuracy.  In  an
effort  to  understand  why  these  approaches  have  failed,  and  to 
gain  greater  insights  into  the  determinants  of  rating  accuracy, 
researchers  have  stressed  the  need  for  a  new  approach.  This  new 
approach  focuses  on  analyzing  the  process  underlying  performance 
ratings  (DeNisi,  Cafferty,  &  Meglino,  1984;  Feldman,  1981;  Ilgen
&  Feldman,  1983).

Rating  process  research  focuses  on  the  rater’s  selection, 
storage,  retrieval,  and  evaluation  of  information  during  the 
rating  task.  Many  studies  using  this  cognitive  approach  have 
appeared  recently  (for  example,  Banks,  1985;  Murphy  &  Balzer,
1986;  Murphy,  Balzer,  Lockhart,  &  Eisenman,  1985).  In  the
typical  rating  process  experiment,  stimuli  are  presented  in  a
manner  that  is  relatively  non-taxing  of  information  processing
capabilities.  The  sole  task  of  the  participants  is  to  rate  the
performance  of  a  target  person  (McIntyre,  Smith,  &  Bassett,  1984;
Murphy,  Martin,  &  Garcia,  1982;  Pulakos,  1984).  In  spite  of  the
non-taxing  nature  of  most  experiments,  recent  research  has  shown 
that  the  processing  of  performance  information  can  be  biased  by 
performance  cues,  by  initial  impressions,  and  by  prior  formal 
judgments  of  the  stimulus  target.  It  is  generally  considered
that  these  manipulations  instill  a  schema  that  biases  the  rater's
processing  of  subsequently  seen  information  (Balzer,  1986;
DeNisi  et  al.,  1984;  Lord,  1985).

While  these  studies  have  demonstrated  the  potential  of  this 
cognitive  approach  for  performance  appraisal,  Banks  and  Murphy
(1985)  have  recently  argued  that  this  line  of  research  is  likely
to  widen  the  gap  between  research  and  practice.  To  prevent  this
from  happening,  it  is  essential  for  future  research  in
performance  appraisal  to  ensure  that  the  cognitive  processes
captured  in  the  laboratory  are  similar  to  those  in  organizational
settings.  Thus,  future  laboratory  studies  should  incorporate
contextual  variables,  such  as  competing  tasks,  time  pressures,  and
delays  between  observing  ratee  behavior  and  appraisals,  to 
enhance  their  external  validity. 

As  noted  by  Feldman  (1981),  information  about  other 
organizational  members  generally  occurs  in  complex  and  noisy
informational  environments.  Supervisors  are  often  simultaneously
exposed  to  the  behaviors  of  several  individuals,  complex  task
information,  and  their  own  thoughts  and  memories.  From  these
multiple  sources,  they  must  select  relevant  information  and
organize  it  into  patterns  that  can  be  understood  and  remembered.
Given  our  limited  information  capacity,  selective  attention  in
most  organizational  environments  is  determined  by  automatic
processes.  Posner  (1982),  in  reviewing  work  on  attention  and
performance,  notes  that  people  can  manage  multiple  tasks  because
they  use  automatic  processes  to  simultaneously  monitor  several
informational  sources  with  almost  no  interference.  He  points  out
that  it  is  only  when  we  "take  notice  of  targets"  (use  controlled 
processes)  that  our  capacity  is  severely  limited  and  information 
sources  interfere  with  one  another. 

During  automatic  processing,  the  information  that  is  noticed 
is  highly  dependent  on  salient  stimulus  characteristics  (e.g., 
loudness  or  uniqueness,  Taylor  &  Fiske,  1978)  as  well  as  the 
cognitive  schema  guiding  perception.  In  turn,  the  availability 
of  schemata  to  guide  information  processing  depends  on  primes 
such  as  past  employee  performance  or  the  goals  of  the  perceiver
(Foti  &  Lord,  1987;  Lord,  1985).  There  has  been  a  debate  in  the
current  literature  as  to  whether  schematic  processing  causes  a
biased  search  for  either  confirming  behaviors  or  inconsistent
behaviors  (Balzer,  1986;  Murphy  et  al.,  1985).  However,  under
conditions  which  tax  processing  capabilities,  it  is  more  likely
that  raters  will  note  schema-inconsistent  behaviors.  For
example,  White  and  Carlston  (1983)  had  subjects  listen  to  two
conversations  simultaneously.  Subjects  were  given  prior
personality  information  about  one  of  the  target  stimuli.  They
found  that  subjects  spent  more  time  monitoring  the  conversation
involving  the  target  about  which  they  had  no  prior  information.
In  addition,  subjects  tended  to  switch  their  attention  to  the
primed  target's  conversation  when  a  schema-inconsistent  behavior
occurred.  These  findings  support  Feldman's  (1981)  notion  that 
supervisors  will  automatically  process  subordinates'  performance 
until  an  atypical  behavior  causes  the  supervisor  to  move  to  a 
controlled  level  of  processing. 




The  purpose  of  the  present  study  was  to  assess  the  effect  of 
increasing  information  processing  demands  on  psychometric  rating 
outcomes  as  well  as  rating  accuracy.  As  processing  demands 
increase,  participants  should  rely  more  on  automatic  processing 
of  performance.  Since  processing  done  automatically  is  highly 
influenced  by  the  schema  guiding  perception,  we  hypothesized  that 
performance  cues  presented  prior  to  observation  would  have 
stronger  impact  as  processing  demands  increased.  More 
specifically,  the  present  study  utilized  a  3  (level  of 
processing)  x  3  (good,  poor  or  no  performance  information)  design 
and  predicted  that  as  processing  demands  increased:  (1) 
participants  given  a  good  performance  cue  would  provide 
increasingly  lenient  ratings  and  those  given  a  poor  performance 
cue  would  provide  increasingly  strict  ratings;  (2)  participants 
would  rely  more  on  prior  performance  cues,  resulting  in  less 
differentiation  of  performance  dimensions  and  thereby  more  halo; 
and  (3)  participants  would  be  less  accurate  in  recognizing 
behaviors  consistent  with  their  performance  cue. 


Method


Subjects 


Participants  in  the  study  were  145  introductory  psychology
students,  73  males  and  72  females,  with  a  median  age  of  19  years. 
Participants  were  randomly  assigned  to  experimental  conditions 
with  the  stipulation  that  both  males  and  females  be  equally 
distributed  across  conditions. 
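
The assignment rule described above (random assignment to the nine cells of the design, with males and females spread evenly across cells) can be sketched as follows. This is only an illustration under stated assumptions: the function name `assign_balanced`, the seed, and the round-robin dealing of shuffled participants are hypothetical choices, not the procedure the authors report using.

```python
import random
from collections import Counter
from itertools import product

# The 3 x 3 design: level of processing crossed with performance cue.
PROCESSING = ["edited", "tape", "task"]
CUES = ["good", "poor", "none"]
CELLS = list(product(PROCESSING, CUES))  # nine experimental cells


def assign_balanced(participants, seed=0):
    """Randomly assign (id, sex) pairs to cells, balancing sex across cells.

    Hypothetical helper: within each sex, participants are shuffled and
    dealt round-robin over a shuffled cell order, so cell counts for a
    given sex differ by at most one.
    """
    rng = random.Random(seed)
    assignment = {}
    for sex in sorted({s for _, s in participants}):
        ids = [pid for pid, s in participants if s == sex]
        rng.shuffle(ids)
        order = CELLS[:]
        rng.shuffle(order)  # randomize which cells absorb any remainder
        for i, pid in enumerate(ids):
            assignment[pid] = order[i % len(order)]
    return assignment


# 73 males and 72 females, as in the sample described above.
participants = [(i, "M") for i in range(73)] + [(i, "F") for i in range(73, 145)]
assignment = assign_balanced(participants)
cell_sizes = Counter(assignment.values())
```

With 145 participants, each cell receives 8 females and either 8 or 9 males, which is as close to an equal distribution as the sample size allows.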

Stimulus  Material

A  15-minute  color  videotape  of  an  instructor  lecturing  on 
the  topic  area  of  consumer  psychology  served  as  the  stimulus 
material.  The  tape  was  developed  and  used  in  a  previous  study  by 
Hauenstein  and  Alexander  (1986).  Embedded  in  the  tape  were  16
behavioral  incidents  representative  of  four  performance
dimensions:  four  good  behaviors  representing  the  dimension
organization,  four  good  behaviors  corresponding  to  depth  of
knowledge,  and  two  good  and  two  poor  behaviors  for  each  of  the
dimensions  delivery  and  relevance.  This  videotape  was  used  because
previous  research  indicated  that  it  represented  average 
performance,  thereby  avoiding  a  ceiling  effect  problem  in  testing 
the  leniency/severity  hypothesis. 

Procedure 

Participants  reported  to  the  lab  in  groups  ranging  from  five 
to  ten  persons.  All  participants  were  told  that  they  were  about
to  watch  a  videotape  of  a  brief  lecture  after  which  they  would  be 
asked  to  rate  the  performance  of  the  instructor.  Prior  to 
viewing  the  videotape,  participants  were  given  written
instructions  containing  the  performance  cue  manipulation  and  the
rating  dimensions.  After  viewing  the  tape,  participants
completed  a  short  filler  task,  the  Picture-Number  Test  (Ekstrom,
French,  Harmon,  &  Derman,  1976),  to  eliminate  the  effects  of
short-term  memory,  and  then  completed  the  rating  form  and  a
recognition  memory  questionnaire.

Experimental  Manipulation

Level  of  processing.  Three  levels  of  this  factor  were 
manipulated,  designed  to  create  a  continuum  from  controlled  to 
automatic  processing.  In  the  edited  condition,  the  instructor’s 
behaviors  were  grouped  (by  editing  the  videotape)  according  to 
the  four  performance  dimensions.  This  manipulation  was  designed 
to  create  extremely  controlled  processing  because  it  eliminated 
for  subjects  the  task  of  deciding  which  behaviors  were  most 
relevant  for  each  performance  dimension.  In  the  tape  condition, 
subjects  were  shown  the  videotape  in  normal  order,  similar  to 
other  laboratory  studies  of  performance  appraisal.  Again,  this 
manipulation  was  designed  to  create  controlled  processing 
although  not  as  extreme  as  in  the  previous  condition.  In  the 
task  condition,  participants  were  shown  the  videotape  and  also 
completed  an  additional  task  which  involved  thinking  up  at  least 
10  uses  for  two  common  objects.  This  manipulation  was  designed 
to  create  more  automatic  processing,  since  participants' 
attention  was  divided  between  the  two  tasks. 

Performance  cue.  This  factor  was  manipulated  by  written 
instructions.  In  each  of  the  level  of  processing  conditions 
described  above,  participants  either  were  given  a  paragraph
describing  the  past  performance  of  the  instructor  as  good  or
poor,  or  received  no  performance  information.

Dependent  Variables

Rating  scales.  The  instructor's  performance  was  evaluated 
using  five  7-point  graphic  rating  scales  with  anchors  of  poor  and
excellent.  The  five  scales  consisted  of  the  four  dimensions
embedded  in  the  videotape  and  one  dimension  measuring  an  overall
evaluation.  Leniency  was  operationalized  as  simply  the  average
dimension  rating.  Mean  rating  differences  between  the  dimensions
were  expected  because  poor  behaviors  occurred  on  only  two
performance  dimensions.  Therefore,  while  halo  was
operationalized  as  each  participant's  standard  deviation  across  all
four  performance  dimensions  (Saal,  Downey,  &  Lahey,  1980),
subjects'  ratings  were  converted  to  standard  scores  within  each
dimension  prior  to  the  computation  of  the  halo  scores  (Pulakos,
Schmitt,  &  Ostroff,  1986).  The  less  the  dispersion  across  the
dimensions,  the  smaller  the  standard  deviation  and  the  stronger
the  halo  effect.
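
The two rating indices described above can be illustrated with a short sketch. It rests on several assumptions that the text does not settle: leniency is averaged over the four embedded dimensions only, population standard deviations are used both for standardizing and for the halo score, and the dimension keys and function names are hypothetical. The three raters are invented solely to show the arithmetic.

```python
from statistics import mean, pstdev

# Hypothetical keys for the four dimensions embedded in the videotape.
DIMENSIONS = ["organization", "depth_of_knowledge", "delivery", "relevance"]


def leniency(ratings):
    """Average dimension rating for one rater (higher = more lenient)."""
    return mean(ratings[d] for d in DIMENSIONS)


def standardize_within_dimension(all_ratings):
    """Convert raw ratings to standard scores within each dimension."""
    z_rows = [dict() for _ in all_ratings]
    for dim in DIMENSIONS:
        column = [r[dim] for r in all_ratings]
        m, sd = mean(column), pstdev(column)
        for z_row, r in zip(z_rows, all_ratings):
            # A dimension with no variance carries no information.
            z_row[dim] = (r[dim] - m) / sd if sd > 0 else 0.0
    return z_rows


def halo_scores(all_ratings):
    """Per-rater SD across standardized dimensions (smaller = stronger halo)."""
    z_rows = standardize_within_dimension(all_ratings)
    return [pstdev([z[d] for d in DIMENSIONS]) for z in z_rows]


# Invented ratings on the 7-point scales, for illustration only.
raters = [
    dict(zip(DIMENSIONS, [6, 6, 2, 2])),  # differentiates the dimensions
    dict(zip(DIMENSIONS, [4, 4, 4, 4])),  # same standing on every dimension
    dict(zip(DIMENSIONS, [2, 2, 6, 6])),  # differentiates the other way
]
scores = halo_scores(raters)
```

Standardizing first removes the mean differences between dimensions that the stimulus tape builds in, so a rater who sits at the same relative position on every dimension gets a halo score of zero even if the raw ratings differ.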

Recognition  memory  questionnaire.  Recognition  memory  for 
whether  specific  instances  of  behavior  were  exhibited  by  the 
instructor  was  measured  by  a  32  item  questionnaire.  Eight  items 
pertaining  to  each  of  the  four  dimensions  represented  on  the 
videotape  were  included.  Within  each  subset  of  eight  items,  four 
of  the  behaviors  had  appeared  on  the  videotape,  while  the  other 
four  had  not  been  exhibited  by  the  instructor.  Because  more  good 
than  poor  behaviors  occurred  in  the  stimulus  tape,  the 
questionnaire  contained  24  good  behaviors  and  8  poor  behaviors.
Recognition  accuracy  for  both  good  and  poor  behaviors  was
measured  by  the  following  formula:  number  of  true  positives  plus
number  of  true  negatives,  divided  by  the  total  number  of  behaviors.
True  positives  refer  to  the  number  of  occurring  behaviors  correctly
identified,  and  true  negatives  refer  to  the  number  of
non-occurring  behaviors  correctly  identified.
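
The accuracy formula above amounts to the proportion of items answered correctly. A minimal sketch, assuming each questionnaire item is scored as a yes/no response against whether the behavior actually occurred (the function name and the example responses are hypothetical):

```python
def recognition_accuracy(responses, occurred):
    """(true positives + true negatives) / total number of items.

    responses: True where the rater said the behavior was exhibited.
    occurred:  True where the behavior actually appeared on the tape.
    """
    if not responses or len(responses) != len(occurred):
        raise ValueError("responses and occurred must be non-empty and equal in length")
    pairs = list(zip(responses, occurred))
    true_positives = sum(1 for said, truth in pairs if said and truth)
    true_negatives = sum(1 for said, truth in pairs if not said and not truth)
    return (true_positives + true_negatives) / len(pairs)


# Invented example: eight poor-behavior items, four shown and four foils.
# The rater recognizes all four shown behaviors but also endorses three foils.
occurred = [True] * 4 + [False] * 4
responses = [True] * 4 + [True, True, True, False]
accuracy = recognition_accuracy(responses, occurred)  # (4 + 1) / 8 = 0.625
```

Because shown behaviors and foils are equal in number, a rater who says "yes" to everything scores .50, so a low true negative rate pulls accuracy toward chance even when every shown behavior is recognized.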

Manipulation  checks.  A  check  for  each  experimental  factor
was  included  on  a  final  questionnaire.  Participants'  perceptions 
of  the  amount  of  time  their  attention  was  focused  on  watching  the 
instructor  and  the  instructor's  previous  performance  were  each 
assessed  by  two  5-point  Likert  scale  items. 

Results 


Manipulation  Checks 

A  3  (level  of  processing)  X  3  (performance  cue)  analysis  of 
variance  (ANOVA)  was  used  to  assess  the  impact  of  the 
experimental  manipulations  on  the  participants'  questionnaire 
responses.  First,  as  expected,  participants  who  were  given  an 
additional  task  to  complete  reported  that  they  spent  less  time 
watching  the  instructor  (M  =  2.35)  than  did  participants  who
only  watched  the  tape,  either  in  normal  order  (M  =  3.11)  or
edited  (M  =  3.59),  F(2,  136)  =  23.51,  p  <  .001.  Second,
participants  in  the  good  performance  conditions  rated  their
expectations  for  the  instructor's  performance  significantly
higher  (M  =  4.06)  than  did  participants  in  the  poor  (M  =  1.15)  or
no  performance  information  (M  =  2.92)  conditions,  F(2,  136)  =
285.41,  p  <  .001.

Performance  Ratings 

Leniency.  Since  the  experimental  manipulations  were
successful,  we  can  examine  their  effects  on  leniency.  Hypothesis 
1  predicted  an  interaction  between  level  of  processing  and 
performance  cue,  such  that  participants  using  automatic 
processing  (i.e.,  the  task  condition)  and  receiving  the  good 
performance  cue  would  evaluate  the  instructor  most  leniently,  and 
those  receiving  the  poor  performance  cue  would  evaluate  the
instructor  most  strictly.  Results  of  a  3  X  3  multivariate
analysis  of  variance  (MANOVA)  using  all  five  performance
dimensions  as  dependent  variables  provided  strong  support  for
this  hypothesis,  approximate  F(40,  502)  =  2.45,  p  <  .01.


Insert  Table  1  about  here 


As  can  be  seen  in  the  ANOVAs  in  Table  1,  the  level  of  processing
X  performance  cue  interaction  was  significant  for  three  of  the
four  individual  dimensions  as  well  as  the  overall  rating.  A
priori  comparisons  were  performed  on  the  good  versus  poor  cell 
means  (see  Table  2). 


Insert  Table  2  about  here 


The  increasing  magnitude  of  these  deviations  clearly  showed  that 
the  performance  cue  manipulation  had  the  least  impact  in  the 
edited  condition  and  the  most  impact  in  the  task  condition.
Therefore,  as  predicted,  there  was  greater  reliance  on  past
performance  information  as  information  processing  demands 
increased.  Organization  was  the  only  dimension  where  this  trend 
was  not  seen.  The  most  likely  explanation  for  this  discrepancy 
was  salience  induced  by  the  timing  and  behavioral  content  of  this 
dimension.  The  critical  behaviors  for  organization  occurred
either  near  the  beginning  or  end  of  the  videotape  and  typically
involved  the  lecturer  moving  from  the  lectern  to  the  chalkboard
to  refer  to  an  outline  of  the  lecture.

Halo.  Hypothesis  2  predicted  an  ordinal  interaction  between
level  of  processing  and  the  performance  cue  due  to  the  inclusion 
of  a  no  performance  cue  condition  in  the  design.  Across  all 
levels  of  processing  demands,  the  subjects  receiving  no 
performance  information  were  expected  to  sample  both  good  and 
poor  behaviors  leading  to  greater  differentiation  across 
dimensions.  For  those  subjects  receiving  prior  performance 
information,  the  increase  in  processing  demands  would  cause
greater  reliance  on  the  performance  cue  leading  to  less 
differentiation  across  dimensions.  A  3  (level  of  processing)  X  3 
(performance  cue)  ANOVA  was  performed  on  the  halo  scores.  Only 
the  level  of  processing  main  effect  was  significant,  F(2,  136)  =
3.36,  p  <  .05.  Halo  tended  to  become  stronger  as  processing  demands
increased  across  all  levels  of  the  performance  cue  condition
(edited  M  =  .77,  tape  M  =  .68,  and  task  M  =  .63).  These  results
suggest  that  there  is  a  general  tendency  for  halo  strength  to
increase  as  processing  demands  during  observation  of  performance
increase,  regardless  of  the  schema  used  to  aid  the  processing.

Recognition  Memory

We  expected  that  as  information  processing  demands 
increased,  participants  would  be  least  accurate  in  recognizing 
behaviors  consistent  with  their  performance  cue.  This  would 
occur  because  raters  using  automatic  processing  would  rely  on  a 
preexisting  schema  activated  by  the  performance  cue  to  process 
the  information  and  during  retrieval  would  have  trouble  saying
"no"  to  schema-consistent  behaviors  that  had  not  occurred.

Good  behavior  accuracy  scores  were  entered  in  a  3  (level  of 
processing)  X  3  (type  of  performance  cue)  ANOVA.  The  expected 
interaction  between  level  of  processing  and  performance  cue  did 
not  occur;  instead,  two  strong  main  effects  were  found  (see  Table  3).


Insert  Table  3  about  here 


First,  increasing  processing  demands  caused  a  large  decrease
(eta²  =  .37)  in  accuracy  (edited  M  =  .83,  tape  M  =  .76,  task  M  =
.69).  Second,  the  strong  performance  cue  effect  (eta²  =  .19)
was  due  to  subjects  receiving  the  good  cue  (M  =  .70)  being  less 
accurate  than  subjects  in  poor  cue  (M  =  .78)  and  no  cue  (M  =  .80) 
conditions.  As  expected,  examining  the  cell  means  in  Table  3 
clearly  showed  that  the  low  true  negative  rates  were  causing  the 
lower  accuracy  for  subjects  in  the  good  cue  condition.  Subjects 
receiving  the  good  cue  were  more  likely  to  overestimate  the 
frequency  of  good  behaviors.  The  interaction  did  not
occur  because  good  cue  subjects  in  the  edited  condition  were
just  as  likely  to  overestimate  the  frequency  of  good  behavior  as 
subjects  in  the  more  taxing  tape  and  task  conditions.  Unlike  the 
leniency  results  for  the  actual  ratings,  the  performance  cue 
biased  recognition  accuracy  even  in  the  least  taxing  condition. 

Results  for  the  recognition  accuracy  for  poor  behaviors  were 
problematic.  The  same  analysis  as  used  for  the  good  behaviors 
resulted  in  no  significant  effects.  As  can  be  seen  in  Table  3,
accuracy  for  poor  performance  was  close  to  chance  due  to  the  low 
true  negative  rates.  The  commitment  of  many  false  positives 
could  have  been  due  to  a  strong  negative  impression.  However, 
the  performance  ratings  for  the  control  group  indicated  an 
average  impression.  Closer  examination  of  response  patterns  to 
the  poor  behavior  foil  items  suggested  that  the  problem  was  due  to
the  foil  items.  For  example,  one  of  the  poor  foil  items  for 
delivery  was:  Instructor  used  a  lot  of  "uhms".  Careful 
examination  of  the  stimulus  tape  had  shown  the  instructor  as 
saying  "uhm"  only  once.  In  the  context  of  a  fifteen  minute 
lecture,  apparently  that  was  perceived  as  "a  lot"  of  uhms.  Three 
of  the  four  poor  foil  items  were  subject  to  this  type  of 
problem.1  Given  the  small  total  number  of  poor  behaviors,  this 
problem  prevents  an  adequate  test  of  our  hypothesis. 

Discussion 

The  present  study  was  designed  to  examine  the  cognitive 
processes  which  mediate  the  performance  rating  process. 
Specifically,  we  manipulated  the  level  of  information  processing 
demands  on  raters  as  well  as  varied  ratee  past  performance 
information.  Since  processing  done  automatically  is  heavily 
influenced  by  the  schema  guiding  perception,  we  expected  that 
past  performance  information  would  be  utilized  more  by  raters  as 
processing  demands  increased. 

The  results  showed  strong  support  for  our  first  prediction 
concerning  the  leniency  of  evaluations.  Subjects  using  automatic 
processing  were  influenced  the  most  by  prior  performance 
information.  These  results  suggest  that  the  initial
categorization  of  the  ratee  by  the  supervisor  is  critical.  Given 
that  such  consistent  effects  were  found  when  using  only  one 
competing  task,  the  effects  of  the  supervisor's  initial 
categorization  would  probably  be  much  stronger  under  the  typical 
high  processing  demands  of  day-to-day  organizational  life.

The  results  of  the  halo  analysis  suggest  that  under  more 
automatic  levels  of  processing  raters  demonstrate  a  stronger 
halo  effect,  regardless  of  the  schema  used  to  aid  in  processing 
the  performance  information.  Murphy  and  Balzer  (1986)  found 
similar  results  when  processing  demands  were  taxed  by  a  time 
delay.  They  suggested  that  systematic  distortions  by  the  raters 
caused  the  increase  in  halo.  The  most  likely  explanation  of  our 
finding  is  that  increasing  processing  demands  caused  an 
undersampling  of  specific  behaviors  leading  to  increased  reliance 
on  the  general  impression  (cf.  Cooper,  1981,  p.  220).  If  there
is  a  linear  relationship  (up  to  the  point  that  processing  demands
are  overwhelming)  between  processing  demands  and  halo  strength,  it
raises  questions  concerning  the  notion  that  halo  and  accuracy  are
positively  related  (Cooper,  1981;  Murphy  &  Balzer,  1986).  It  may
be  that  this  relationship  does  not  generalize  from  the  laboratory 
to  the  organization.  If  increasing  processing  demands  increases 
halo  strength,  then  halo  in  organizational  performance  appraisals 
could  greatly  overestimate  the  relationships  among  dimensions, 
possibly  to  the  point  that  accuracy  is  no  longer  related  to  halo. 

The  recognition  results  for  good  behaviors  raised  some 
interesting  issues.  Even  under  the  most  nontaxing  conditions,
subjects  were  less  accurate  at  recognizing  behaviors  consistent
with  the  performance  cue.  In  contrast,  for  more  categorical 
level  judgments  (i.e.,  performance  ratings)  the  performance  cue 
had  little  impact  in  the  nontaxing  condition.  Several 
implications  should  be  noted  from  this  finding.  Performance 
appraisal  researchers  should  attempt  to  understand  the 
differences  in  behavioral  versus  categorical  judgments  (cf.
Lord,  1985;  Phillips  &  Lord,  1986).  Schematic  biases  appear  to
be  strongest  at  the  behavioral  level.  Therefore,  rater  training
programs  that  focus  on  improving  the  quality  of  ratings
through  accurate  recall  of  specific  behaviors  are  not  likely  to
succeed  (e.g.,  Thornton  &  Zorich,  1979).  A  better  training
strategy  may  be  to  focus  on  the  supervisor's  initial
categorization  of  the  ratee.  To  this  end,  frame-of-reference
training  (Bernardin  &  Buckley,  1981)  may  be  the  best  vehicle.  By
training  supervisors  to  adopt  appropriate  evaluative  schemata,
their  critical  initial  categorization  of  the  ratee  should  be  more
accurate  (McIntyre  et  al.,  1982).

The  finding  that  raters  receiving  the  good  performance  cue 
were  less  accurate  in  recognizing  good  behaviors  may  also  suggest  how
supervisors  change  their  perceptions  of  workers.  While  the 
initial  categorization  of  a  worker  is  likely  to  be  resistant  to
change  (Fiske  &  Taylor,  1984),  the  supervisor  is  most  likely  to
attend  to  and  store  behaviors  inconsistent  with  the  initial
categorization  (cf.  White  &  Carlston,  1983).  Our  results  are
consistent  with  Feldman's  notion  that  a  worker  exhibiting  an 
unexpected  behavior  will  cause  elaborated  processing  on  the  part 
of  the  supervisor.  In  terms  of  cognitive  mechanisms  governing
this  type  of  processing,  the  salience  (Taylor  &  Fiske,  1978)  of
the  unexpected  behavior  would  attract  the  supervisor's  attention
and  a  schema-plus-tag  model  of  memory  (Graesser,  Gordon,  &
Sawyer,  1979)  could  explain  why  it  is  more  available  in  memory.

This  type  of  cognitive  process  also  has  critical 
implications  for  the  performance  ratings  in  organizations.  The 
major  issue  is  how  much  contradictory  information  is  needed 
before  a  supervisor  can  no  longer  discount  the  information  and 
must  recategorize  the  ratee  (i.e.,  change  their  impression).
From  a  rater  training  perspective  this  issue  is  probably  as
important  as  the  process  involved  in  the  initial  categorization.

To  summarize,  the  results  of  the  present  study  advance  our 
understanding  of  information  processing  in  more  realistic 
settings.  Our  results  suggest  that:  (1)  the  schema  utilized  by
the  supervisor  to  form  an  initial  impression  of  the  ratee  is 
critical;  (2)  after  the  initial  categorization  of  the  ratee,  the 
supervisor  is  more  likely  to  use  controlled  processing  when 
attending  to  behaviors  inconsistent  with  the  initial  impression; 
(3)  the  relationship  between  halo  and  accuracy  may  not  generalize 
from  the  laboratory  to  the  organization  setting.  It  is  important 
to  remember  that  in  the  present  study,  raters'  attention  was 
divided  between  only  two  tasks.  Future  research  will  be  able  to 
determine  if  the  patterns  found  in  this  study  become  more 
pronounced  as  information  loads  become  heavier.

In  conclusion,  the  results  of  the  present  research  make  it 
clear  that  future  laboratory  studies  need  to  be  more  realistic. 
If  not,  researchers  cannot  be  certain  that  the  cognitive
processes  we  are  tapping  in  the  laboratory  are  similar  to  those 
utilized  in  organizational  settings.  Only  by  incorporating  more 
contextual  variables  into  future  performance  appraisal  research  can  we
prevent  the  gap  between  research  and  practice  from  widening. 

Footnotes

1  A  recent  study  by  Hauenstein,  Whitcomb,  &  Foti  (1987)  was 
conducted  using  the  same  stimulus  tape  and  a  revised 
antiprototypical  recognition  measure.  Pilot  data  for  this  study
found  much  higher  true  negative  rates  for  the  antiprototypical 

•Higher  values  indicate  more  lenient  ratings
•Standard  deviations  appear  in  parentheses 

Table  2

A  priori  Comparisons  Between  Performance  Ratings  in  the  Good
and  Poor  Performance  Cue  Conditions

Dimension             Edited t    Tape t      Task t
Depth of Knowledge    1.16*       1.75**      2.49***
Delivery               .34        1.69**      2.39***
Relevance              .74        1.80***     3.44***
Organization           .50        1.25**      1.20**
Overall                .19        1.29**      1.91***